Many people want Leela to perform well in cold-start positions. However, network evaluations suffer when no history is available. I propose that when position history is missing, we simply fill the missing history planes with copies of the oldest available position.
This PR implements that. The effect of the change (policy evals at the position, plus value evals after bestmove) is shown for the win-at-chess positions, as tracked at win-at-chess tracking.
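As a rough illustration of the idea (not the code in this PR), the fill can be sketched in Python. The function name `fill_history`, the default of 8 history slots, and the newest-first ordering are assumptions made for this example. The sketch also treats each position as an opaque value and ignores any per-ply perspective flipping that a real encoder may need.

```python
from typing import List, Optional, Sequence, TypeVar

P = TypeVar("P")  # opaque stand-in for one position's set of input planes


def fill_history(
    available: Sequence[P],
    slots: int = 8,            # assumed number of history slots
    copy_oldest: bool = True,  # False reproduces the "no history" behaviour
) -> List[Optional[P]]:
    """Return exactly `slots` entries, newest first.

    `available` holds the known positions, newest first. Missing slots are
    padded with the oldest known position (or None when copy_oldest is False,
    which stands in for empty planes).
    """
    if not available:
        raise ValueError("at least the current position is required")
    filled: List[Optional[P]] = list(available[:slots])
    pad = filled[-1] if copy_oldest else None
    filled.extend([pad] * (slots - len(filled)))
    return filled
```

For a position set up from a bare FEN, `fill_history(["current"])` yields eight copies of the current position, while `copy_oldest=False` leaves the seven older slots empty.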
See: win-at-chess--history-compare.pdf
I've also taken a look at the effect of the rule50 input in cold-start positions.
See: win-at-chess--history--rule50.pdf
It is clear that copying history is a significant improvement for cold-start positions.
There is potential for negative impact at the start position and the moves immediately following it. However, judging by the evaluations for the start position and for Black's first move, the impact seems minimal and should quickly be trained away, and MCTS will mostly result in the same move selection anyway. There appears to be a flattening effect on the policy, so the immediate impact would simply be slightly more diverse opening choices.
Of course, it would be possible to track the start position node and skip the fill when the oldest node in history is the start position. Alternatively, the engine could simply be parameterized. However, applying the fill consistently is the simplest option.
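To make the alternatives concrete, here is a small sketch of what a parameterized decision could look like. The enum `HistoryFill`, its values, and the function `should_fill` are all hypothetical names invented for this example. They are not options that this PR adds.

```python
from enum import Enum


class HistoryFill(Enum):
    """Hypothetical engine parameter, for illustration only."""
    NO = "no"                        # never fill, i.e. the behaviour before this PR
    ALWAYS = "always"                # fill consistently, as this PR proposes
    SKIP_STARTPOS = "skip_startpos"  # fill unless the oldest node is the start position


def should_fill(mode: HistoryFill, oldest_is_startpos: bool) -> bool:
    """Decide whether missing history slots should be filled with copies."""
    if mode is HistoryFill.NO:
        return False
    if mode is HistoryFill.ALWAYS:
        return True
    # SKIP_STARTPOS: a game that truly begins at startpos has no older history to fake.
    return not oldest_is_startpos
```

The consistent option corresponds to `HistoryFill.ALWAYS`, which needs no extra state. The start-position variant requires the caller to know whether the oldest node in history is the start position.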
Here is go nodes 800 evaluation at startpos without history
Here is go nodes 800 evaluation at startpos with copied/fake history